Updated: 2021-08-13

The concept of machine learning

 

We hear a lot about “machine learning”

How does it work?

What you will learn

 

  • What is Machine Learning?
  • What are the main categories of Machine Learning?
  • What are some examples of Machine Learning?
  • How does Machine Learning “work”?

What is Machine Learning?

 

One definition: “Machine Learning is the semi-automated extraction of knowledge from data”

  • Knowledge from data: Starts with a question that might be answerable using data
  • Automated extraction: A computer provides the insight
  • Semi-automated: Requires many smart decisions by a human

What are the main categories of Machine Learning?

 

Supervised learning: Making predictions using data

Unsupervised learning: Extracting structure from data

Example of supervised learning - farm practice

Supervised: we know the categories

Example of unsupervised learning - what farm types are there?

Unsupervised: We don’t know the categories
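
As a rough illustration of extracting structure without known categories, here is a minimal k-means sketch in plain Python. The farm measurements (area in hectares, livestock units), the choice of two clusters, and the starting centroids are all made-up assumptions for illustration. k-means is just one of many clustering methods and is not named in the original slides.

```python
from math import dist

def kmeans(points, centroids, iterations=10):
    """Plain k-means: alternately assign points to the nearest centroid
    and move each centroid to the mean of its assigned points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster happens to be empty.
        centroids = [
            tuple(sum(col) / len(cluster) for col in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    assignments = [
        min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        for p in points
    ]
    return centroids, assignments

# Hypothetical, unlabeled farms: (area in ha, livestock units)
farms = [(10, 2), (12, 3), (11, 1), (80, 40), (90, 45), (85, 50)]

# Assumption: we guess there are two farm types and seed one centroid in each region.
centroids, assignments = kmeans(farms, centroids=[farms[0], farms[3]])
print(assignments)  # cluster index per farm; the meaning of each cluster is up to us
```

Note that the algorithm only returns group indices. Deciding what each group means (for example "small mixed farm" versus "large livestock farm") is still a human decision.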

How does Machine Learning “work”?

 

High-level steps of supervised learning:

  1. First, train a Machine Learning model using labeled data

    • “Labeled data” is data for which the outcome is already known and recorded
    • The “Machine Learning model” learns the relationship between the attributes of the data and its outcome
  2. Then, make predictions on new data for which the label is unknown
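
The two steps above can be sketched with a very simple model. This is a minimal nearest-centroid classifier in plain Python, chosen only because it is short. The farm attributes and the labels "organic" and "conventional" are hypothetical and do not come from the slides.

```python
from math import dist

def train(features, labels):
    """Step 1: learn one centroid (mean point) per label from labeled data."""
    grouped = {}
    for x, y in zip(features, labels):
        grouped.setdefault(y, []).append(x)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }

def predict(model, x):
    """Step 2: return the label whose centroid is closest to the new point."""
    return min(model, key=lambda label: dist(model[label], x))

# Hypothetical labeled data: attributes are (area in ha, livestock units),
# the outcome is the farm practice.
features = [(10, 2), (12, 3), (11, 1), (80, 40), (90, 45), (85, 50)]
labels = ["organic", "organic", "organic",
          "conventional", "conventional", "conventional"]

model = train(features, labels)

# New data for which the label is unknown.
new_farm = (15, 4)
prediction = predict(model, new_farm)
print(prediction)
```

Here the "model" is nothing more than a dictionary of per-label averages. Real models are more elaborate, but the train-then-predict shape is the same.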

ML structure and terminology

The primary goal of supervised learning is to build a model that “generalizes”, that is, one that performs well on data it was not trained on.

The goal is to accurately predict the future rather than the past!
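
To make "generalizes" concrete, the sketch below compares two toy models on data held back from training. Everything here is an assumption for illustration: the data (hours of sunshine mapped to a "low" or "high" yield), the memorizing model, and the threshold rule.

```python
def accuracy(predict, data):
    """Fraction of (x, y) pairs for which predict(x) equals y."""
    return sum(predict(x) == y for x, y in data) / len(data)

# Hypothetical data, split into a training set and an unseen test set.
train_data = [(1, "low"), (3, "low"), (5, "low"),
              (7, "high"), (9, "high"), (11, "high")]
test_data = [(2, "low"), (4, "low"), (8, "high"), (10, "high")]

# Model A memorizes the training examples and guesses "low" for anything new.
memory = dict(train_data)

def memorizer(x):
    return memory.get(x, "low")

# Model B learns a single threshold: the midpoint between the two class means.
def fit_threshold(data):
    low = [x for x, y in data if y == "low"]
    high = [x for x, y in data if y == "high"]
    return (sum(low) / len(low) + sum(high) / len(high)) / 2

threshold = fit_threshold(train_data)

def threshold_model(x):
    return "high" if x >= threshold else "low"

for name, model in [("memorizer", memorizer), ("threshold", threshold_model)]:
    print(name, accuracy(model, train_data), accuracy(model, test_data))
```

Both models are perfect on the training data ("the past"), but only the threshold rule also does well on the held-out data ("the future"). Training accuracy alone cannot tell the two apart.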

Questions about Machine Learning

 

  • How do I choose which attributes of my data to include in the model?
  • How do I choose which model to use?
  • How do I optimize this model for best performance?
  • How do I ensure that I’m building a model that will generalize to unseen data?
  • Can I estimate how well my model is likely to perform on unseen data?
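
For the last question, one common approach (not named in the slides, so treat it as a suggestion) is k-fold cross-validation: hold out part of the data in turn, train on the rest, and average the scores. A minimal sketch in plain Python, using a hypothetical threshold model and made-up data:

```python
def fit_threshold(data):
    """Toy model: midpoint between the mean of the 'low' and 'high' examples."""
    low = [x for x, y in data if y == "low"]
    high = [x for x, y in data if y == "high"]
    return (sum(low) / len(low) + sum(high) / len(high)) / 2

def k_fold_accuracy(data, k):
    """Average held-out accuracy over k folds."""
    folds = [data[i::k] for i in range(k)]  # every k-th row goes to the same fold
    scores = []
    for i, held_out in enumerate(folds):
        training = [row for j, fold in enumerate(folds) if j != i for row in fold]
        threshold = fit_threshold(training)
        correct = sum(
            ("high" if x >= threshold else "low") == y for x, y in held_out
        )
        scores.append(correct / len(held_out))
    return sum(scores) / k

# Hypothetical labeled data: (hours of sunshine, yield category)
data = [(1, "low"), (2, "low"), (3, "low"), (4, "low"), (5, "low"),
        (7, "high"), (8, "high"), (9, "high"), (10, "high"), (11, "high")]

estimate = k_fold_accuracy(data, k=5)
print(estimate)
```

The estimate is only as trustworthy as the data is representative. On such a small, cleanly separated toy set it will look better than anything you should expect in practice.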

Resources

Live coding
